Mining Interesting Itemsets using Submodular Optimization

Author

Jaroslav Fowkes

Abstract

We propose a novel technique to retrieve itemsets that best explain a transaction database by leveraging a simple probabilistic model. Our approach is the first to infer such interesting itemsets directly from the transaction database using submodular function optimization and, in so doing, avoids many of the pitfalls commonly present in frequent itemset mining algorithms. Our proposed approach is theoretically simple, straightforward to implement, and trivially parallelizable, and it exhibits good performance, as we demonstrate on both synthetic and real-world examples.

Similar resources

A New Approach for Mining Top-Rank-k Erasable Itemsets

Erasable itemset mining, first introduced in 2009, is an interesting variation of pattern mining. Managers can use erasable itemsets to plan a factory's production. Besides the problem of mining erasable itemsets, the problem of mining top-rank-k erasable itemsets is interesting and practical. In this paper, we first propose a new structure, called dPID_List, and two ...

On Mining Max Frequent Generalized Itemsets

A fundamental task of data mining is to mine frequent itemsets. Since the number of frequent itemsets may be large, a compact representation, namely the max frequent itemsets, has been introduced. On the other hand, the concept of generalized itemsets was proposed. Here, the items form a taxonomy. Although the transactional database only contains items in the leaf level of the taxonomy, a gener...

Efficient Computation of Partial-Support for Mining Interesting Itemsets

Mining interesting itemsets is a popular topic in the data mining community. The objective of this problem is to mine all interesting itemsets with respect to a given interestingness measure. While considerable effort has been spent on justifying the various interestingness measures, the algorithms that mine them are not well studied, except in the case of support, which has resulted in ...

Depth-First Non-Derivable Itemset Mining

Mining frequent itemsets is one of the main problems in data mining. Much effort went into developing efficient and scalable algorithms for this problem. When the support threshold is set too low, however, or the data is highly correlated, the number of frequent itemsets can become too large, independently of the algorithm used. Therefore, it is often more interesting to mine a reduced collecti...

Mining Frequent Itemsets Using Support Constraints

Interesting patterns often occur at varied levels of support. Classic association mining based on a uniform minimum support, such as Apriori, either misses interesting patterns of low support or suffers from the bottleneck of itemset generation. A better solution is to exploit support constraints, which specify what minimum support is required for what itemsets, so that only necessary itemse...

Publication date: 2014
